Abstract: In this paper, we investigate the joint optimization of the source and relay beamforming matrices for a two-way amplify-and-forward (AF) multi-input multi-output (MIMO) relay system. The system consists of two source nodes and two relay nodes, and a linear minimum mean-square-error (MMSE) receiver is employed at both receivers. We assume individual relay power constraints and study an important design problem, the so-called determinant maximization (DM) problem. Since this DM problem is nonconvex, we develop an efficient iterative algorithm that uses an MSE balancing result to obtain at least a locally optimal solution. The proposed algorithm is built on the QL, QR, and Cholesky decompositions, which differ in complexity and performance. Analytical and simulation results show that the proposed algorithm significantly reduces the computational complexity compared with existing two-way relay systems and achieves bit-error-rate (BER) performance equivalent to that of the singular value decomposition (SVD) based regular block diagonal (RBD) scheme.

Index Terms: Two-way relay channel, MIMO, QL-QR decomposition, Cholesky decomposition, determinant maximization, amplify-and-forward.

I. Introduction 

Recently, wireless relay networks have been the focus of a lot of research, because relaying is a promising technique for extending the coverage or increasing the system capacity. Various cooperative relaying schemes have been proposed, such as the amplify-and-forward (AF) [1], [2], decode-and-forward (DF) [3], denoise-and-forward (DNF) [4], and compress-and-forward (CF) [5] protocols. Among these approaches, AF is the most widely used because the relay does not need to detect the transmitted signal. An AF relay scheme therefore requires less processing power at the relays than the other schemes.

In the one-way relaying (OWR) approach, four time slots are required in the uplink (UL) and downlink (DL) communications to completely exchange information between two base stations, which leads to a loss of one-half of the spectral resources [6]. To solve this problem, a two-way relaying approach has been considered in [7], [8], and [9]. In a typical two-way relaying scheme, the communication is completed in two steps. First, the transmitters send their symbols to the two relays simultaneously. After receiving the signals, each relay processes them based on an efficient relaying scheme to produce new signals. Then the processed signals are broadcast to both receiver nodes.

Multi-input multi-output (MIMO) relay systems have been investigated in [10]-[13]. It is shown that, by employing multiple antennas at the transmitter and/or the receiver, one can significantly improve the transmission reliability by leveraging spatial diversity. Relay precoder design methods have been investigated in [14], [15], [16]. Designing optimal beamforming vectors for multicasting is challenging because the problem is nonconvex. In [14], the authors propose a transceiver precoding scheme at the relay node using zero-forcing (ZF) and MMSE criteria with certain antenna configurations. The information-theoretic capacity of multi-antenna multicasting is studied in [15], along with the achievable rates of lower-complexity transmission schemes as the number of antennas or users goes to infinity. In [16], the authors propose an alternative method to characterize the capacity region of the two-way relay channel (TWRC) by applying the idea of a rate profile.

Joint optimization of the relay and source nodes for the MIMO TWRC has been studied in [9], [17]. In [9], the authors develop a unified framework for optimizing two-way linear non-regenerative MIMO relay systems and show that the optimal relay and source matrices have a general beamforming structure. The joint source node and relay precoding design for minimizing the mean squared error in a MIMO two-way relay (TWR) system is studied in [17].

Singular value decomposition (SVD) and/or generalized SVD (GSVD) are widely used to find the orthogonal
complement to solve an optimization problem 111, 0, ifT^ . 
||2^ . but their computational complexities are extremely high. 
In order to reduce the complexity, the SVD can be replaced with the less complex QR decomposition [18]. However, this approach degrades the BER performance. In addition, it is difficult to realize in the TWRC. In this paper, we investigate the joint source and relay precoding matrix optimization for a two-way amplify-and-forward relaying system where two source nodes and two relay nodes are equipped with multiple antennas. In order to apply the QL/QR decomposition to the TWRC, we design a three-part relay filter. Compared with existing works such as [9]-[14], the contributions of this paper can be summarized as follows. Firstly, we investigate a two-way MIMO relay system using the criterion that minimizes the MSE of the signal
Fig. 1. Proposed QL-QR amplify-and-forward MIMO TWR system.


waveform estimation at both source nodes. We prove that the optimal sum-MSE solution is obtained with the Wiener filter when the signal-to-noise ratios (SNRs) at the two source nodes are equal [19], which leads to an MSE balancing result. Secondly, we propose a new cooperative scenario, i.e., the QL-QR decomposition combined with the Cholesky decomposition, which significantly reduces the computational complexity of the optimal design. In the proposed design, the channels on the left side of the relay are decomposed by the QL decomposition, while those on its right side are factorized by the QR decomposition, and the equivalent noise covariance is decomposed by the Cholesky decomposition. We also design a three-part relay filter, comprising a left filter, a middle filter, and a right filter, to efficiently combine the two source nodes and the relay nodes. With these approaches, the received signals at both source nodes can be expressed in terms of lower or upper triangular matrices. Because the determinant of a triangular matrix equals the product of its eigenvalues, i.e., of its diagonal entries, we can straightforwardly solve the optimization problem as a determinant maximization problem. We also obtain BER performance equivalent to that of the SVD-RBD scheme.

The rest of this paper is organized as follows. Section II describes the system model of the TWRC and formulates the sum-MSE problem. In Section III, we propose an iterative QL-QR algorithm and a joint optimal beamforming design. In Section IV, we discuss the computational complexity of the proposed scheme. Simulation results are presented in Section V to show the performance of the proposed algorithm for the TWRC. Section VI concludes this paper.

Notations: $\mathbf{A}^*$ and $\mathbf{A}^T$ denote the conjugate and the transpose of a matrix $\mathbf{A}$, respectively. $d(\cdot)$ denotes a diagonal matrix, and the $N \times N$ identity matrix is denoted by $\mathbf{I}_N$. $E(\cdot)$ stands for the statistical expectation and $(\cdot)^H$ denotes the Hermitian transpose of a given matrix. $\mathrm{tr}(\cdot)$ and $R(\cdot)$ denote the trace and the range of a matrix, respectively.

II. System Model and Sum-MSE

We consider a TWRC consisting of two source nodes $S_1$ and $S_2$, and two relay nodes $R_1$ and $R_2$, as shown in Fig. 1. The source and relay nodes are equipped with $M$ and $N$ antennas, respectively. We adopt the relay protocol with two time slots introduced in [17]. In the first time slot, the information vector $\mathbf{x}_i$ is linearly processed by a precoding matrix $\mathbf{V}_i$ and transmitted to the relay nodes. The received signals at $R_i$, $i \in \{1,2\}$, can be expressed as

$$\mathbf{y}_{R_1} = \mathbf{H}_{1,1}\mathbf{s}_1 + \mathbf{H}_{1,2}\mathbf{s}_2 + \mathbf{n}_{R_1},$$
$$\mathbf{y}_{R_2} = \mathbf{H}_{2,1}\mathbf{s}_1 + \mathbf{H}_{2,2}\mathbf{s}_2 + \mathbf{n}_{R_2}, \tag{1}$$

where $\mathbf{y}_{R_i} \in \mathbb{C}^{N \times 1}$, $i \in \{1,2\}$, indicates the received signal vector, $\mathbf{H}_{i,j} \in \mathbb{C}^{N \times M}$, $i,j \in \{1,2\}$, represents the channel matrix from source $j$ to relay $i$, as shown in Fig. 1, $\mathbf{s}_i \in \mathbb{C}^{M \times 1}$ is the transmitted symbol vector from $S_i$, and $\mathbf{n}_{R_i} \sim \mathcal{CN}(\mathbf{0}, \sigma_{R_i}^2 \mathbf{I}_N)$ represents the additive white Gaussian noise (AWGN) vector with zero mean and variance $\sigma_{R_i}^2$ at relay node $i$. The term $\mathbf{s}_i$ is subject to a power constraint,
tr{E{sisf)} < Pi with tr{E{xixf)} < §Im, where P* is 
the transmit power at S^. 

To find an appropriate power normalization factor $\rho_R$, we express the total transmission power at the source nodes in terms of $\mathbf{V}_i$ as

fr{ViVf+ V2Vf} = tr [p], (v? + V^ (V^)") } 

= tr{pl{P^+P2)). (2) 

In this paper, we assume that each transmit antenna satisfies the unit transmission power constraint. To satisfy the power constraint, we propose the following power normalization factor:

$$\rho_R = \frac{1}{\sqrt{P_1 + P_2}}. \tag{3}$$

In the second time slot, after power normalization, the relay node $R_i$ linearly amplifies $\mathbf{y}_{R_i}$ with an $N \times N$ matrix $\mathbf{F}_i$ and then broadcasts the amplified signal vector $\mathbf{x}_{R_i}$ to source nodes 1 and 2. The signals transmitted from relay node $i$ can be expressed as

$$\mathbf{x}_{R_i} = \rho_R \mathbf{F}_i \mathbf{y}_{R_i}. \tag{4}$$

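Using (1) and (4), the received signal vectors at $S_1$ and $S_2$ can be, respectively, written as

$$\mathbf{y}_1 = \mathbf{H}_{1,1}^T\mathbf{F}_1\mathbf{H}_{1,1}\mathbf{s}_1 + \mathbf{H}_{1,1}^T\mathbf{F}_1\mathbf{H}_{1,2}\mathbf{s}_2 + \mathbf{H}_{2,1}^T\mathbf{F}_2\mathbf{H}_{2,1}\mathbf{s}_1 + \mathbf{H}_{2,1}^T\mathbf{F}_2\mathbf{H}_{2,2}\mathbf{s}_2 + \mathbf{H}_{1,1}^T\mathbf{F}_1\mathbf{n}_{R_1} + \mathbf{H}_{2,1}^T\mathbf{F}_2\mathbf{n}_{R_2} + \mathbf{n}_1,$$
$$\mathbf{y}_2 = \mathbf{H}_{1,2}^T\mathbf{F}_1\mathbf{H}_{1,1}\mathbf{s}_1 + \mathbf{H}_{1,2}^T\mathbf{F}_1\mathbf{H}_{1,2}\mathbf{s}_2 + \mathbf{H}_{2,2}^T\mathbf{F}_2\mathbf{H}_{2,1}\mathbf{s}_1 + \mathbf{H}_{2,2}^T\mathbf{F}_2\mathbf{H}_{2,2}\mathbf{s}_2 + \mathbf{H}_{1,2}^T\mathbf{F}_1\mathbf{n}_{R_1} + \mathbf{H}_{2,2}^T\mathbf{F}_2\mathbf{n}_{R_2} + \mathbf{n}_2, \tag{5}$$

where $\mathbf{H}_{i,j}^T$, $i,j \in \{1,2\}$, indicates the $M \times N$ channel matrix from relay node $i$ to source node $j$, and $\mathbf{n}_i$, $i \in \{1,2\}$, is an $M \times 1$ noise vector at $S_i$. We assume that the relay nodes perfectly know the channel state information (CSI) of $\mathbf{H}_{i,j}$. The relay node $R_i$ performs the optimizations of $\mathbf{F}_i$ and $\mathbf{V}_i$, and then transmits the information to source nodes 1 and 2. Since source node $i$ knows its own transmitted signal vector $\mathbf{s}_i$ and the full CSI, the self-interference components in (5) can be efficiently cancelled. The effective received signal vectors are given by

$$\mathbf{y}_1 = \mathbf{H}_{1,1}^T\mathbf{F}_1\mathbf{H}_{1,2}\mathbf{s}_2 + \mathbf{H}_{2,1}^T\mathbf{F}_2\mathbf{H}_{2,2}\mathbf{s}_2 + \mathbf{H}_{1,1}^T\mathbf{F}_1\mathbf{n}_{R_1} + \mathbf{H}_{2,1}^T\mathbf{F}_2\mathbf{n}_{R_2} + \mathbf{n}_1 = \tilde{\mathbf{H}}_1\mathbf{s}_2 + \tilde{\mathbf{n}}_1, \tag{6}$$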
Fig. 2. Examples of the SNR regions achieved in a TWRC with two relays.


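$$\mathbf{y}_2 = \mathbf{H}_{1,2}^T\mathbf{F}_1\mathbf{H}_{1,1}\mathbf{s}_1 + \mathbf{H}_{2,2}^T\mathbf{F}_2\mathbf{H}_{2,1}\mathbf{s}_1 + \mathbf{H}_{1,2}^T\mathbf{F}_1\mathbf{n}_{R_1} + \mathbf{H}_{2,2}^T\mathbf{F}_2\mathbf{n}_{R_2} + \mathbf{n}_2 = \tilde{\mathbf{H}}_2\mathbf{s}_1 + \tilde{\mathbf{n}}_2, \tag{7}$$

where $\tilde{\mathbf{H}}_1 = \mathbf{H}_{1,1}^T\mathbf{F}_1\mathbf{H}_{1,2} + \mathbf{H}_{2,1}^T\mathbf{F}_2\mathbf{H}_{2,2}$ and $\tilde{\mathbf{H}}_2 = \mathbf{H}_{1,2}^T\mathbf{F}_1\mathbf{H}_{1,1} + \mathbf{H}_{2,2}^T\mathbf{F}_2\mathbf{H}_{2,1}$ are the equivalent MIMO channels seen at source nodes $S_1$ and $S_2$, respectively. The vectors $\tilde{\mathbf{n}}_1 = \mathbf{H}_{1,1}^T\mathbf{F}_1\mathbf{n}_{R_1} + \mathbf{H}_{2,1}^T\mathbf{F}_2\mathbf{n}_{R_2} + \mathbf{n}_1$ and $\tilde{\mathbf{n}}_2 = \mathbf{H}_{1,2}^T\mathbf{F}_1\mathbf{n}_{R_1} + \mathbf{H}_{2,2}^T\mathbf{F}_2\mathbf{n}_{R_2} + \mathbf{n}_2$ are the equivalent noises at source nodes $S_1$ and $S_2$, respectively.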
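The two-slot signal model can be checked numerically. The following is a minimal NumPy sketch, not the authors' simulation code: the antenna numbers, the random relay matrices, and the helper `crandn` are illustrative assumptions, and the factor $\rho_R$ is left out, as it is in (5). It verifies that removing the self-interference at $S_1$ leaves exactly the equivalent channel and noise of (6).

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 2, 4  # antennas per source / per relay (illustrative values)

def crandn(*shape):
    """Unit-variance circularly symmetric complex Gaussian samples (hypothetical helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

# H[i][j]: channel from source j+1 to relay i+1 (N x M); F[i]: relay matrix (N x N)
H = [[crandn(N, M) for _ in range(2)] for _ in range(2)]
F = [crandn(N, N) for _ in range(2)]
s = [crandn(M) for _ in range(2)]     # transmitted vectors s_1, s_2
n_R = [crandn(N) for _ in range(2)]   # relay noise vectors
n_1 = crandn(M)                       # noise at source node S_1

# First slot, eq. (1): signals received at the two relays
y_R = [H[i][0] @ s[0] + H[i][1] @ s[1] + n_R[i] for i in range(2)]

# Second slot, eq. (5): S_1 receives through the reciprocal channels H_{i,1}^T
y_1 = sum(H[i][0].T @ F[i] @ y_R[i] for i in range(2)) + n_1

# Self-interference cancellation: S_1 knows s_1 and the CSI
self_interference = sum(H[i][0].T @ F[i] @ H[i][0] @ s[0] for i in range(2))
y_1_eff = y_1 - self_interference

# Equivalent channel and equivalent noise of eq. (6)
H_eq1 = sum(H[i][0].T @ F[i] @ H[i][1] for i in range(2))
n_eq1 = sum(H[i][0].T @ F[i] @ n_R[i] for i in range(2)) + n_1
```

With these definitions, `y_1_eff` agrees with `H_eq1 @ s[1] + n_eq1` up to rounding error.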

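Owing to their lower computational complexity, linear receivers are applied at source node $i$ to retrieve the signals transmitted from the other node. The estimated signal waveform vector is given as $\hat{\mathbf{s}}_{\bar{i}} = \mathbf{W}_i^H\mathbf{y}_i$, where $\mathbf{W}_i$ is an $M \times M$ weight matrix, with $\bar{i} = 2$ for $i = 1$ and $\bar{i} = 1$ for $i = 2$. From (6), the MSE matrix of the signal waveform estimation is denoted by $\mathrm{MSE}_i = E[(\hat{\mathbf{s}}_{\bar{i}} - \mathbf{s}_{\bar{i}})(\hat{\mathbf{s}}_{\bar{i}} - \mathbf{s}_{\bar{i}})^H]$, which can be further written as

$$\mathrm{MSE}_i = (\mathbf{W}_i^H\tilde{\mathbf{H}}_i - \mathbf{I}_M)(\mathbf{W}_i^H\tilde{\mathbf{H}}_i - \mathbf{I}_M)^H + \mathbf{W}_i^H\mathbf{C}_{\tilde{n}_i}\mathbf{W}_i, \tag{8}$$

where $\mathbf{C}_{\tilde{n}_i} = \mathbf{H}_{1,i}^T\mathbf{F}_1\mathbf{F}_1^H\mathbf{H}_{1,i}^* + \mathbf{H}_{2,i}^T\mathbf{F}_2\mathbf{F}_2^H\mathbf{H}_{2,i}^* + \mathbf{I}_M$ is the equivalent noise covariance. The sum-MSE of the two source nodes in the proposed system model can be written as

$$\mathrm{MSE}_{\mathrm{sum}} = \mathrm{MSE}_1 + \mathrm{MSE}_2. \tag{9}$$

Note that the sum-MSE minimization criterion measures the overall transmission performance of both the DL and the UL, since the two data streams transmitted in opposite directions during the two time slots are both taken into account in the TWR network.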


III. Joint Source and Relay Beamforming Design

In this section, we develop an iterative QL-QR algorithm 
by using the MSE balancing result. The QL-QR algorithm 
involves two steps, i.e., the linear receiver matrix optimization 
and the joint source and relay beamformer design. 


A. Proposed Optimal Detector and Optimization Problem 
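We would like to find the jointly optimal matrices $\mathbf{W}_i$, $\mathbf{V}_i$, and $\mathbf{F}_i$ such that the following sum-MSE is minimized:

$$\min_{\mathbf{W}_1,\mathbf{W}_2,\mathbf{F}_1,\mathbf{F}_2,\mathbf{V}_1,\mathbf{V}_2} \mathrm{MSE}_{\mathrm{sum}}. \tag{10}$$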


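According to (4), we consider the following transmission power constraint at the relay nodes:

$$\mathrm{tr}(\mathbf{F}_1\mathbf{D}_1\mathbf{F}_1^H + \mathbf{F}_2\mathbf{D}_2\mathbf{F}_2^H) \le P_{R_1} + P_{R_2} = P_R, \tag{11}$$

where $\mathbf{D}_1 = \rho_R^2(\mathbf{H}_{1,1}\mathbf{V}_1\mathbf{V}_1^H\mathbf{H}_{1,1}^H + \mathbf{H}_{1,2}\mathbf{V}_2\mathbf{V}_2^H\mathbf{H}_{1,2}^H + \mathbf{I}_N)$ and $\mathbf{D}_2 = \rho_R^2(\mathbf{H}_{2,1}\mathbf{V}_1\mathbf{V}_1^H\mathbf{H}_{2,1}^H + \mathbf{H}_{2,2}\mathbf{V}_2\mathbf{V}_2^H\mathbf{H}_{2,2}^H + \mathbf{I}_N)$. Here $P_{R_i}$ denotes the power constraint at relay node $R_i$, and $P_R$ is the total relay power. The transmission power constraint at the two source nodes can be written as

$$\mathrm{tr}(\mathbf{V}_i\mathbf{V}_i^H) \le P_i, \quad i = 1,2, \tag{12}$$

where $P_i$ is the available power at the $i$-th source node. According to (10), (11), and (12), the joint optimization problem of the sum-MSE can be formulated as follows:

$$\min_{\mathbf{W}_1,\mathbf{W}_2,\mathbf{F}_1,\mathbf{F}_2,\mathbf{V}_1,\mathbf{V}_2} \mathrm{MSE}_{\mathrm{sum}}$$
$$\text{s.t.} \quad \mathrm{tr}(\mathbf{F}_1\mathbf{D}_1\mathbf{F}_1^H + \mathbf{F}_2\mathbf{D}_2\mathbf{F}_2^H) \le P_R,$$
$$\text{s.t.} \quad \mathrm{tr}(\mathbf{V}_i\mathbf{V}_i^H) \le P_i. \tag{13}$$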


It is shown in [20] that, at the optimum, $\mathrm{SNR}_1 = \mathrm{SNR}_2$ holds true, thus leading to an SNR balancing result. Otherwise, if $\mathrm{SNR}_1 > \mathrm{SNR}_2$, then $P_2$ can be reduced to retain $\mathrm{SNR}_1 = \mathrm{SNR}_2$, and this reduction of $P_2$ will not violate the power constraint, i.e.,

$$P_1 \cdot \mathrm{SNR}_1 = P_2 \cdot \mathrm{SNR}_2. \tag{14}$$

In Fig. 2, we show two examples of the SNR regions with $\alpha_1 = 0.5$ and $\alpha_2 = 0.3$, where $w_i \in [0,1]$ is a Lagrange multiplier weight value and $\alpha_i \in [0,1]$ is an SNR weight value. We have assumed that the sum of the SNRs is a constant. It is clear that the SNR region of $\alpha_1$ is larger than that of $\alpha_2$. For further details, see [20]. As discussed in [21], the optimization problems have performance metrics that are functions of the SNR, namely the MSE at the output of a linear MMSE (LMMSE) filter of each user:

$$\mathrm{MSE} = \frac{1}{1 + \mathrm{SNR}}. \tag{15}$$


By these two approaches, the max-min optimization problem in (13) can be efficiently written as

$$\min_{\mathbf{W}_i,\mathbf{F}_i,\mathbf{V}_i} \mathrm{MSE}_i \tag{16}$$
$$\text{s.t.} \quad \mathrm{tr}(\mathbf{F}_1\mathbf{D}_1\mathbf{F}_1^H + \mathbf{F}_2\mathbf{D}_2\mathbf{F}_2^H) \le P_R \tag{17}$$
$$\text{s.t.} \quad \mathrm{MSE}_1 = \mathrm{MSE}_2, \tag{18}$$

where $i \in \{1,2\}$. Since the optimization problem (16) is nonconvex, it is difficult to obtain the globally optimal solution. In this paper, we present a locally optimal solution of the joint optimization problem over $\mathbf{W}_i$, $\mathbf{V}_i$, and $\mathbf{F}_i$, where $i = 1,2$, which can be obtained in three stages. 1: The linear receiver weight matrices $\mathbf{W}_i$ are optimized with the source precoding matrices $\mathbf{V}_i$ and the relay amplifying matrices $\mathbf{F}_i$ fixed ($\mathbf{W}_i$ does not appear in constraints (17) and (18)). 2: With given $\mathbf{W}_i$ and fixed $\mathbf{F}_i$, update $\mathbf{V}_i$. 3: With given $\mathbf{W}_i$ and $\mathbf{V}_i$, obtain suboptimal $\mathbf{F}_i$ to solve (16).

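Lemma 1: For any fixed $\mathbf{V}_i$ and $\mathbf{F}_i$, the minimization problems in (16) are convex quadratic problems, and the optimal $\mathbf{W}_i$ is obtained as the Wiener filter, which is used to decode $\mathbf{s}_{\bar{i}}$, as follows:

$$\mathbf{W}_i^o = (\tilde{\mathbf{H}}_i\tilde{\mathbf{H}}_i^H + \mathbf{C}_{\tilde{n}_i})^{-1}\tilde{\mathbf{H}}_i. \tag{19}$$

Proof: For source node $i$, the MSE can be further expressed as

$$\mathrm{MSE}_i = \mathbf{W}_i^H\tilde{\mathbf{H}}_i\tilde{\mathbf{H}}_i^H\mathbf{W}_i - \mathbf{W}_i^H\tilde{\mathbf{H}}_i - \tilde{\mathbf{H}}_i^H\mathbf{W}_i + \mathbf{I}_M + \mathbf{W}_i^H\mathbf{C}_{\tilde{n}_i}\mathbf{W}_i. \tag{20}$$

Based on (20), the derivation of the optimal MSE detection matrix is equivalent to solving the following equation:

$$\frac{\partial\,\mathrm{MSE}_i}{\partial\,\mathbf{W}_i^H} = 2\tilde{\mathbf{H}}_i\tilde{\mathbf{H}}_i^H\mathbf{W}_i - 2\tilde{\mathbf{H}}_i + 2\mathbf{C}_{\tilde{n}_i}\mathbf{W}_i = 0. \tag{21}$$

Then, we obtain the closed-form solution of $\mathbf{W}_i$, which is

$$\mathbf{W}_i^o = (\tilde{\mathbf{H}}_i\tilde{\mathbf{H}}_i^H + \mathbf{C}_{\tilde{n}_i})^{-1}\tilde{\mathbf{H}}_i. \tag{22}$$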
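The closed form in (22) can be sanity-checked numerically. The sketch below is only an illustration under assumed values: the matrix size, the random stand-ins `H_eq` and `C`, and the helper names are hypothetical. It evaluates the MSE matrix of (8) at the Wiener filter, compares it with the expression $(\mathbf{I}_M + \tilde{\mathbf{H}}^H\mathbf{C}^{-1}\tilde{\mathbf{H}})^{-1}$ that follows from the matrix inversion lemma, and confirms that a perturbed filter does not give a smaller trace.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 3  # illustrative dimension

def crandn(*shape):
    """Unit-variance complex Gaussian samples (hypothetical helper)."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H_eq = crandn(M, M)                 # stand-in for the equivalent channel
G = crandn(M, M)
C = G @ G.conj().T + np.eye(M)      # Hermitian positive definite noise covariance

def mse_matrix(W):
    """MSE matrix of eq. (8) for a receive filter W, assuming unit-power symbols."""
    E = W.conj().T @ H_eq - np.eye(M)
    return E @ E.conj().T + W.conj().T @ C @ W

# Wiener filter of eq. (22): solve (H H^H + C) W = H
W_opt = np.linalg.solve(H_eq @ H_eq.conj().T + C, H_eq)

mse_opt = mse_matrix(W_opt)
# Closed form obtained with the matrix inversion lemma
mse_closed = np.linalg.inv(np.eye(M) + H_eq.conj().T @ np.linalg.solve(C, H_eq))

# Any other filter should not reduce the total MSE
mse_perturbed = mse_matrix(W_opt + 0.05 * crandn(M, M))
```

Using `np.linalg.solve` instead of an explicit inverse is a numerical convenience here, not something prescribed by the paper.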


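This completes the proof.

With the optimal $\mathbf{W}_i^o$ fixed, the outer minimization problem in (16) can be rewritten as

$$\min_{\mathbf{F}_1,\mathbf{F}_2,\mathbf{V}_1,\mathbf{V}_2} \mathrm{MSE}_i^o$$
$$\text{s.t.} \quad \mathrm{tr}(\mathbf{F}_1\mathbf{D}_1\mathbf{F}_1^H + \mathbf{F}_2\mathbf{D}_2\mathbf{F}_2^H) \le P_R,$$
$$\text{s.t.} \quad \mathrm{MSE}_1 = \mathrm{MSE}_2, \tag{23}$$

where $\mathrm{MSE}_i^o$ is the MSE matrix using $\mathbf{W}_i^o$. By substituting (22) into (20), we have

$$\mathrm{MSE}_i^o = [\mathbf{I}_M + \tilde{\mathbf{H}}_i^H\mathbf{C}_{\tilde{n}_i}^{-1}\tilde{\mathbf{H}}_i]^{-1}, \quad i = 1,2. \tag{24}$$

Note that the matrix inversion lemma is used to obtain (24).

B. Joint Optimal Source and Relay Beamforming Matrices Design and Iterative Algorithm

In this section, we focus on the source and relay beamforming matrices design and develop an iterative algorithm which is suboptimal for the general case, but has a much lower computational complexity. For fixed $\mathbf{F}_i$, the source precoding matrix $\mathbf{V}_i$ is optimized by solving the following problem:

$$\min_{\mathbf{V}_1,\mathbf{V}_2} \mathrm{tr}\,[\mathbf{I}_M + \mathbf{V}_2^H\mathbf{\Phi}\mathbf{V}_2]^{-1}$$
$$\text{s.t.} \quad \mathrm{tr}\{\rho_R^2(\mathbf{V}_1^H\mathbf{\Psi}_1\mathbf{V}_1 + \mathbf{V}_2^H\mathbf{\Psi}_2\mathbf{V}_2)\} \le P_R,$$
$$\text{s.t.} \quad \mathrm{tr}\{\mathbf{V}_i^H\mathbf{V}_i\} \le P_i, \tag{25}$$

where $\mathbf{\Phi} = \tilde{\mathbf{H}}_1^H\mathbf{C}_{\tilde{n}_1}^{-1}\tilde{\mathbf{H}}_1$, $\mathbf{\Psi}_1 = \mathbf{H}_{1,1}^H\mathbf{F}_1^H\mathbf{F}_1\mathbf{H}_{1,1} + \mathbf{H}_{2,1}^H\mathbf{F}_2^H\mathbf{F}_2\mathbf{H}_{2,1}$, and $\mathbf{\Psi}_2 = \mathbf{H}_{1,2}^H\mathbf{F}_1^H\mathbf{F}_1\mathbf{H}_{1,2} + \mathbf{H}_{2,2}^H\mathbf{F}_2^H\mathbf{F}_2\mathbf{H}_{2,2}$. The Lagrangian function associated with the problem (25) is given by

$$L_V = \mathrm{tr}\,[\mathbf{I}_M + \mathbf{V}_2^H\mathbf{\Phi}\mathbf{V}_2]^{-1} + \sum_{i=1}^{2}\mu_i\left(\mathrm{tr}\{\mathbf{V}_i^H\mathbf{V}_i\} - P_i\right) + \mu_3\left\{\mathrm{tr}\{\rho_R^2(\mathbf{V}_1^H\mathbf{\Psi}_1\mathbf{V}_1 + \mathbf{V}_2^H\mathbf{\Psi}_2\mathbf{V}_2)\} - P_R\right\}, \tag{26}$$

where $\mu_i \ge 0$ is the Lagrange multiplier.

Case 1: When $\mu_i = 0$, setting the derivative of $L_V$ with respect to $\mathbf{V}_2$ to zero, we obtain

$$\frac{\partial L_V}{\partial \mathbf{V}_2} = -[\mathbf{I}_M + \mathbf{V}_2^H\mathbf{\Phi}\mathbf{V}_2]^{-2}\mathbf{V}_2^H\mathbf{\Phi} = 0. \tag{27}$$

Since $\mathbf{V}_2$ and $\mathbf{\Phi}$ are nonsingular matrices, (27) can be represented as

$$\mathbf{I}_M + \mathbf{V}_2^H\mathbf{\Phi}\mathbf{V}_2 = 0. \tag{28}$$

In (28), however, $\mathbf{I}_M > 0$ and $\mathbf{V}_2^H\mathbf{\Phi}\mathbf{V}_2 > 0$. Consequently, in Case 1, the optimal solution does not exist.

Case 2: When $\mu_i > 0$, we rewrite the Lagrangian function as

$$L_V = \mathrm{tr}\,[\mathbf{I}_M + \mathbf{V}_2^H\mathbf{\Phi}\mathbf{V}_2]^{-1} - \mu_1P_1 - \mu_2P_2 - \mu_3P_R + \mathrm{tr}\{\mathbf{V}_2^H\mathbf{\Pi}\mathbf{V}_2\} + \mathrm{tr}\{\mu_3\rho_R^2\mathbf{V}_1^H\mathbf{\Psi}_1\mathbf{V}_1 + \mu_1\mathbf{V}_1^H\mathbf{V}_1\}, \tag{29}$$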


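where $\mathbf{\Pi} = \mu_2\mathbf{I}_M + \mu_3\rho_R^2\mathbf{\Psi}_2$. We obtain the derivative of $L_V$ as

$$\frac{\partial L_V}{\partial \mathbf{V}_2} = -[\mathbf{I}_M + \mathbf{V}_2^H\mathbf{\Phi}\mathbf{V}_2]^{-2}\mathbf{V}_2^H\mathbf{\Phi} + \mathbf{V}_2^H\mathbf{\Pi} = 0. \tag{30}$$

Since $\mathbf{V}_2$ and $\mathbf{\Phi}$ are nonsingular matrices, multiplying both sides by $(\mathbf{V}_2^H)^{-1}$ and rearranging, we have

$$(\mathbf{V}_2^H)^{-1}[\mathbf{I}_M + \mathbf{V}_2^H\mathbf{\Phi}\mathbf{V}_2]^{-2}\mathbf{V}_2^H = \mathbf{\Pi}\mathbf{\Phi}^{-1}. \tag{31}$$

Since $\mathbf{\Phi}$ is Hermitian and positive definite, we apply the Cholesky decomposition $\mathbf{\Phi} = \mathbf{\Omega}^H\mathbf{\Omega}$, where $\mathbf{\Omega}$ is a lower triangular matrix. Consequently, we represent (31) as

$$(\mathbf{\Omega}^H)^{-1}(\mathbf{V}_2^H)^{-1}[\mathbf{I}_M + \mathbf{V}_2^H\mathbf{\Omega}^H\mathbf{\Omega}\mathbf{V}_2]^{-2}\mathbf{V}_2^H\mathbf{\Omega}^H = (\mathbf{\Omega}^H)^{-1}\mathbf{\Pi}\mathbf{\Omega}^{-1}. \tag{32}$$

By the matrix identity

$$[\mathbf{I}_M + \mathbf{X}\mathbf{X}^H]^{-1}\mathbf{X} = \mathbf{X}[\mathbf{I}_N + \mathbf{X}^H\mathbf{X}]^{-1}, \tag{33}$$

which holds for any $M \times N$ matrix $\mathbf{X}$, we can rewrite (32) as

$$[\mathbf{I}_M + \mathbf{\Omega}\mathbf{V}_2\mathbf{V}_2^H\mathbf{\Omega}^H]^{-2} = (\mathbf{\Omega}^H)^{-1}\mathbf{\Pi}\mathbf{\Omega}^{-1}. \tag{34}$$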
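The identity (33), and the squared form that is needed to pass from (32) to (34), can be verified on a random matrix. This is a small illustrative check; the sizes and the random $\mathbf{X}$ are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 3, 5  # arbitrary illustrative sizes
X = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
XH = X.conj().T

A = np.eye(M) + X @ XH   # I_M + X X^H
B = np.eye(N) + XH @ X   # I_N + X^H X

# Identity (33): [I_M + X X^H]^{-1} X = X [I_N + X^H X]^{-1}
lhs = np.linalg.solve(A, X)
rhs = X @ np.linalg.inv(B)

# Applying (33) twice gives the squared version used with exponent -2
lhs_sq = np.linalg.solve(A, np.linalg.solve(A, X))
rhs_sq = X @ np.linalg.inv(B @ B)
```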

Solving (iTSl l for V 2 , we obtain (l32l l as 

V2= (35) 


where V = ^. Obviously, the precoding matrix Vi 

can be obtained in the same way. 

Figure 3 shows our proposed relay filter design. The signal received (input) from $S_1$ is amplified by a left filter (LF) matrix $\mathbf{F}_{L,i}$, and the signal received from $S_2$ is amplified by a right filter (RF) matrix $\mathbf{F}_{R,i}$. The center filter (CF) $\mathbf{F}_{D,i}$ then amplifies the outputs of the LF matrix $\mathbf{F}_{L,i}$ and the RF matrix $\mathbf{F}_{R,i}$, and forwards them
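distribution as $\mathbf{U}$, i.e., in (36),

$$|\mathbf{F}_1| = |\mathbf{F}_{L,1}\mathbf{F}_{D,1}\mathbf{F}_{R,1}| = |\mathbf{F}_{D,1}|, \quad |\mathbf{F}_2| = |\mathbf{F}_{L,2}\mathbf{F}_{D,2}\mathbf{F}_{R,2}| = |\mathbf{F}_{D,2}|. \tag{41}$$

Now, let us introduce the following QL decompositions:

$$[\mathbf{H}_{1,1}\mathbf{V}_1, \mathbf{H}_{2,1}\mathbf{V}_1] = [\mathbf{Q}_{L,1}\mathbf{L}_1, \mathbf{Q}_{L,2}\mathbf{L}_2], \tag{42}$$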

Fig. 3 . The relay filter design of the proposed QL-QR technique. 


to $S_1$ and $S_2$ (output).*

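Lemma 2: The optimal relay filter, composed of the matrices $\mathbf{F}_L$, $\mathbf{F}_R$, and $\mathbf{F}_D$, for $R_1$ and $R_2$ can be designed as

$$\mathbf{F}_{L,1} = \mathbf{Q}_{L,1}, \quad \mathbf{F}_{L,2} = \mathbf{Q}_{L,2}, \quad \mathbf{F}_{R,1} = \mathbf{Q}_{R,1}^H, \quad \mathbf{F}_{R,2} = \mathbf{Q}_{R,2}^H,$$
$$\mathbf{F}_{D,1} \text{ and } \mathbf{F}_{D,2} \text{ are diagonal matrices}. \tag{36}$$

Proof: For $\mathbf{F}_{L,i}$ and $\mathbf{F}_{R,i}$, the proof is similar to that of Theorem 3.1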
in II 22 I . For Fd.i, using Theorem 2 in a, the structure of 
FD,i is optimal for the two cases of 

(а) : i?(HiVi)±i?(H2V2);(H*)±R(H;) 

and R(H3Vi)_Li?(H4V2);(H*)_Li?(H^) 

( б ) : R(HiVi)||R(H 2 V 2 );(HD±R(H;) 

and R(HiVi)||i?(H2V2); (H*)±i?(H;). ( 37 ) 


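Case a: If $N = 2$, the optimal $\mathbf{F}_{D,i}$ is a diagonal matrix given as

$$\mathbf{F}_{D,i} \triangleq \begin{bmatrix} f_{D,i,1} & 0 \\ 0 & f_{D,i,2} \end{bmatrix}. \tag{38}$$

If $N = 2a$, $a = 2,3,\ldots$, the optimal $\mathbf{F}_{D,i}$ is a $2 \times 2$ block diagonal matrix given as

$$\mathbf{F}_{D,i} \triangleq \begin{bmatrix} \mathbf{F}_{D,i,1} & \mathbf{0} \\ \mathbf{0} & \mathbf{F}_{D,i,2} \end{bmatrix},$$

where $\mathbf{F}_{D,i,1}$ and $\mathbf{F}_{D,i,2}$ are $\frac{N}{2} \times \frac{N}{2}$ matrices.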
Case b: The optimal F^.i is defined as 


F 


★ 

D,i 


A 




(39) 


(40) 


Discussion: In Case b, $\mathbf{F}_{D,i}^{\star}$ is optimal, but the computational complexity is considerably higher than in Case a, so we exclude it.

For Case a, before we develop a numerical method to solve for $\mathbf{F}_{D,i}$, let us gain some insight into the structure of this suboptimal relay beamforming matrix. To simplify the relay beamforming matrix $\mathbf{F}_{D,i}$, we introduce the following property.

Property 1: The statistical behavior of a unitary matrix $\mathbf{U}$ remains unchanged when it is multiplied by any unitary matrix $\mathbf{T}$ independent of $\mathbf{U}$. In other words, $\mathbf{T}\mathbf{U}$ has the same


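*For example, for $S_1$, the equivalent channel can be written as $\tilde{\mathbf{H}}_1 = \mathbf{H}_{1,1}^T\mathbf{F}_1\mathbf{H}_{1,2} + \mathbf{H}_{2,1}^T\mathbf{F}_2\mathbf{H}_{2,2} = \mathbf{H}_{1,1}^T\mathbf{F}_{L,1}\mathbf{F}_{D,1}\mathbf{F}_{R,1}\mathbf{H}_{1,2} + \mathbf{H}_{2,1}^T\mathbf{F}_{L,2}\mathbf{F}_{D,2}\mathbf{F}_{R,2}\mathbf{H}_{2,2}$. For $S_2$, the equivalent channel can be written as $\tilde{\mathbf{H}}_2 = \mathbf{H}_{1,2}^T\mathbf{F}_1\mathbf{H}_{1,1} + \mathbf{H}_{2,2}^T\mathbf{F}_2\mathbf{H}_{2,1} = \mathbf{H}_{1,2}^T\mathbf{F}_{R,1}\mathbf{F}_{D,1}\mathbf{F}_{L,1}\mathbf{H}_{1,1} + \mathbf{H}_{2,2}^T\mathbf{F}_{R,2}\mathbf{F}_{D,2}\mathbf{F}_{L,2}\mathbf{H}_{2,1}$.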


where Qi.i for z = 1 , 2 , is a unitary matrix with a dimension 
(^AfxJV, {Li, L 2 } € are lower triangular matrices. 

Similarly, let us introduce another decomposition, namely, the QR decomposition, as

$$[\mathbf{H}_{1,2}\mathbf{V}_2, \mathbf{H}_{2,2}\mathbf{V}_2] = [\mathbf{Q}_{R,1}\mathbf{R}_1, \mathbf{Q}_{R,2}\mathbf{R}_2], \tag{43}$$

where Qr^ € (J^a^xm j = 1 ^ 2 , is a unitary matrix, and 
{Ri, R 2 } G C^t^xM upper triangular matrices. Substitut¬ 
ing (l42l l. (l43l l. and (l36l l back into I©, i.e,. for Si, we may get 
equivalent received signals shown as, 

yi = (LfFii.iRi-F L^F£)_ 2 R- 2 )d :2 

+F{F opnR.^ +L^F£)_ 2 fT-fl 2 +^^2 
= Hi, 2 a; 2 +ni, (44) 

where Hi = LfF^.iRi -F L|’F 45 . 2 R -2 and rii = 
L^F 45 .ini 4 j-FL^F 45 . 2 ttfl 2+^2 are efficient channel and noise 
coefficients, obtained from the covariance of ni, we have 

Cl = nirii 

= Lf Fii.iFg ^LJ -F FlF d,2F%^2^1 + (45) 

For fixed Vi and V 2 , using (l44l i and Property 1, the optimal 
problem (l23t becomes 

max fr fliv-FHfC7^Hi') (46) 

s.t. fr(F|^ iDiF£)_i-FF|^ 2 O 2 FD. 2 ) <-Pr- (47) 

Then, (l46T l can be represented as 

fr ((HfCr^Hi)+n) , (48) 

where the lemma fr(A + B) = tr{A) +tr{'B) has been used. 
Since the matrix Ci is Hermitian and positive definite, we can 
decompose this matrix using Cholesky factorization as 

Cl = HfHi (49) 

where Hi denote a lower triangular matrix. By substituting 
( |49] | back into (l48T l. we can simply rewrite the optimal problem 
as 

max (MSE^)”^ 

= max fr ^(Hf (Hf Hi) ^Hi)-Fn^ 

= max tr Hj"^) j 

= max tr (BiBf^) , (50) 

where = denotes n has nothing to do with the maximum 
solution and Bi = Thus, the optimal problem can 

be represented as the determinant maximization of |Bip. 

In Case a, since $\mathbf{F}_{D,i}$ is a block diagonal matrix, its determinant can be written as

$$\det\mathbf{F}_{D,i} = \det\mathbf{F}_{D,i,1} \cdot \det\mathbf{F}_{D,i,2}. \tag{51}$$

Let $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, and $\mathbf{D}$ be $\frac{N}{2} \times \frac{N}{2}$ matrices. We can define


detFu.i^i, for i € 1,2 as, 
dotF 0,i,i — 




A 
B 
A 
B 

= |A||C 

N 


D 

C 

0 

I 

N 


I 
0 

BA 


C 

^D| 


A-i 

BA 


(52) 

to 

^D. 


where I stands for an ^ x ^ identity matrix. In 
obtain maximum detF^i i ^, we should minimize BA 
Let us introduce the SVD of B, A, and D as 

B = UsSijAf, A = Va^aA-a, D = (53) 

where Ui, A^, i G {A, B, D}, are the unitary matrices, and 


(b) 


Fig. 4. The extended system model, (a): The K pair source nodes scenario, 
(b): The T relay nodes scenario. 


Fd.i, Ri, L 2 , Fd. 2 , and R 2 , respectively. Since Hi and Hj^ ^ 
are also lower triangular matrices, we have 

detBi = detHgdetHj”^ 


— n(''l + 


(57) 


S, is an ^ ^ 


/N ^ diagonal matrix. Substituting (l53]) back into 


D, we 

have 




max 

min 

tr (UsSsAf (U^S^Af) UnSiiAgj 

^ D,l,^ D ,2 

min 

tr (Ss 

S.t. 

min 

Vb*, ,54) 

S.t. 


hi ,di 




where bi, di, and at are the diagonal elements of Ss, Sd 
and S. 4 , respectively. To simplify our discussion, we assume 
Fu i i is a semi-positive matrix, thus, we have the minimum 
solution as bidi — 0. Interestingly, if both bi and di are 0, 
Fu i i is a diagonal matrix. Otherwise, it is a lower/upper 
triangular matrix. In addition, for Si, the equivalent channel 
Hi, since the terms Lf, Lg, Ri, and R 2 are upper triangular 
matrices, the optimal Fu i should be an upper triangular 
matrix. Since the equivalent channel H 2 , Li, L 2 , Rg, and 
Rg are lower triangular matrices for S 2 , the optimal F^i i is 
a lower triangular matrix. Therefore, if and only if F^ ^ is 
a diagonal matrix, the sum-MSE is optimal in our proposed 
method. This completes the proof for Lemma 2. 

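Property 2: For any $M \times N$ rectangular matrices $\mathbf{G}$ and $\mathbf{J}$, let $\mathbf{A}$ and $\mathbf{B}$ be the lower/upper triangular matrices obtained from the QR or QL decompositions of $\mathbf{G}$ and $\mathbf{J}$. If $a_{i,i} + b_{i,i} \neq 0$, where $a_{i,i}$ and $b_{i,i}$ are the diagonal elements of $\mathbf{A}$ and $\mathbf{B}$, respectively, we can easily obtain

$$\det(\mathbf{A} + \mathbf{B}) = \prod_{i=1}^{m}(a_{i,i} + b_{i,i}) \ge \det\mathbf{A} + \det\mathbf{B}, \tag{55}$$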
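Property 2 can be illustrated numerically. The sketch below makes an assumption that the statement itself does not spell out: both matrices are upper triangular with real, nonnegative diagonal entries, which is the case in which the inequality in (55) is guaranteed term by term. The size `m` and the random entries are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 4  # illustrative size

# Upper triangular matrices with positive real diagonals (assumption for this check)
A = np.triu(rng.uniform(0.1, 1.0, (m, m)))
B = np.triu(rng.uniform(0.1, 1.0, (m, m)))

# The sum of two upper triangular matrices is upper triangular,
# so its determinant is the product of the summed diagonal entries.
det_sum = np.linalg.det(A + B)
prod_diag = np.prod(np.diag(A) + np.diag(B))

det_A = np.linalg.det(A)
det_B = np.linalg.det(B)
```

Expanding the product of the summed diagonals yields the two products of the individual diagonals plus nonnegative cross terms, which is why `det_sum` is at least `det_A + det_B` under this assumption.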

Consequently, we have 

m 

detH]^ = (^1,2,i/z),1,2^1,2,Z “b ( 2 , 2 , 2 / 1 ?, 2 ,2^2, 2 , 2 ) 

2 = 1 
m 

= J7(bl+C2), 


where ^i is the diagonal element of Hj^ Now, the optimiza¬ 
tion problem can be reformulated as 

|Bi|2 (58) 


|Bi|2 \ 

In Iiv + Hr'HfHi(Hr')7 


It is clear that (l58Tl-(l60t is a convex problem for beamformer 
vectors F^ 1 and F^ 2 , which can be efficiently solved by the 
interior-point method 1^ . 

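In summary, we outline the iterative beamforming design algorithm as follows (QL-QR Algorithm):

Algorithm 1 QL-QR Algorithm

1. Initialize: $\mathbf{F}_i^{(n)}$, $\mathbf{W}_i^{(n)}$, $\mathbf{V}_i^{(n)}$ for $i = 1,2$; set $n = 0$;

2. Repeat:

1: for $n \leftarrow n + 1$ do

2: for given $\mathbf{V}_i^{(n)}$ and $\mathbf{F}_i^{(n)}$, update $\mathbf{W}_i^{(n)}$ using (22);

3: for fixed $\mathbf{F}_i^{(n)}$ and $\mathbf{W}_i^{(n)}$, update $\mathbf{V}_i^{(n)}$ using (35);

4: decompose $(\mathbf{H}_{1,1}^{(n)}\mathbf{V}_1^{(n)}, \mathbf{H}_{2,1}^{(n)}\mathbf{V}_1^{(n)})$ using the QL decomposition as in (42);

5: decompose $(\mathbf{H}_{1,2}^{(n)}\mathbf{V}_2^{(n)}, \mathbf{H}_{2,2}^{(n)}\mathbf{V}_2^{(n)})$ using the QR decomposition as in (43);

6: compute $\mathbf{C}_i^{(n)}$;

7: decompose $\mathbf{C}_i^{(n)}$ using the Cholesky decomposition as in (49);

8: for the fixed Cholesky factor of $\mathbf{C}_i^{(n)}$, compute $\det\mathbf{B}_i^{(n)}$ using (57);

Until $\mathrm{MSE}_i$ converges.

end for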


(56) 


where ci = ^i,i,i/n,i,iri,i,i, C 2 = h,i,ifD, 2 ,iT' 2 y,i, li,iy, 

fD, 2 ,i, and r 2 ,i,i are diagonal elements of Li, 


Since the solution of each subproblem in the QL-QR algorithm is optimal, the total MSE does not increase from one iteration to the next. Because the total MSE is also lower bounded, the iteration converges.

Discussion: Two extended system models are shown in Fig. 4: the multipair scenario with two relay nodes and $K$ pairs of source nodes, and the scenario with $T$ relay nodes ($T$ should be an even number) and two source nodes. In Fig. 4(a), each pair of source nodes and the two relay nodes can be seen as a group. Since the source pairs are independent of each other, each relay node can be equipped with $K$ RFs, $K$ LFs, and one CF. Therefore, the extended system model (a) can be seen as $K$ parallel copies of our proposed system model. In Fig. 4(b), the two source nodes and every two relay nodes can be seen as one group. The extended system model (b) can thus be seen as $T/2$ parallel copies of our proposed system model.

IV. Computational Complexity Analysis 

In this section, we measure the performance of the proposed QL-QR scheme in terms of computational complexity, compared with existing algorithms, using the total number of floating-point operations (FLOPs). A FLOP is defined as a real floating-point operation, i.e., a real addition, multiplication, division, and so on. In [26], the authors show the computational complexity of the real Cholesky decomposition. For complex numbers, a multiplication followed by an addition needs 8 FLOPs, which is 4 times the cost of the real computation. According to [27], the required number of FLOPs of each matrix operation is as follows:

1. Multiplication of $m \times n$ and $n \times p$ complex matrices: $8mnp - 2mp$;

2. Multiplication of $m \times n$ and $n \times m$ complex matrices: $4nm(m + 1)$;

3. SVD of an $m \times n$ ($m \le n$) complex matrix where only $\mathbf{\Sigma}$ is obtained: $32(mn^2 - n^3/3)$;

4. SVD of an $m \times n$ ($m \le n$) complex matrix where only $\mathbf{\Sigma}$ and $\mathbf{\Lambda}$ are obtained: $32(nm^2 + 2m^3)$;

5. SVD of an $m \times n$ ($m \le n$) complex matrix where $\mathbf{U}$, $\mathbf{\Sigma}$, and $\mathbf{\Lambda}$ are obtained: $8(4n^2m + 8nm^2 + 9m^3)$;

6. Inversion of an $m \times m$ real matrix using Gauss-Jordan elimination: $2m^3 - 2m^2 + m$;

7. Cholesky factorization of an $m \times m$ complex matrix: $8m^3/3$;

8. QR or QL decomposition of an $m \times n$ complex matrix: $16(n^2m - nm^2 + m^3/3)$.
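The per-operation counts above translate directly into small helper functions. The sketch below is an illustration; the function names are hypothetical, and the mapping of $(m, n)$ to $(N_i, N_T)$ for the $(2, 2, 2) \times 6$ case with $K = 3$ is our reading of Tables I to IV rather than something stated explicitly. Under that reading, the helpers reproduce several of the tabulated entries.

```python
def flops_matmul(m, n, p):
    """Product of m x n and n x p complex matrices (item 1)."""
    return 8 * m * n * p - 2 * m * p

def flops_gram(m, n):
    """Product of m x n and n x m complex matrices (item 2)."""
    return 4 * n * m * (m + 1)

def flops_svd_full(m, n):
    """SVD of an m x n complex matrix with all three factors (item 5)."""
    return 8 * (4 * n**2 * m + 8 * n * m**2 + 9 * m**3)

def flops_cholesky(m):
    """Cholesky factorization of an m x m complex matrix (item 7)."""
    return 8 * m**3 / 3

def flops_qr(m, n):
    """QR or QL decomposition of an m x n complex matrix (item 8)."""
    return 16 * (n**2 * m - n * m**2 + m**3 / 3)

# (2, 2, 2) x 6 case: K = 3 users, N_i = 2 antennas each, N_T = 6
K, N_i, N_T = 3, 2, 6
svd_total = K * flops_svd_full(N_i, N_T)        # full-SVD rows of Tables II and IV
qr_total = round(2 * K * flops_qr(N_i, N_T))    # QL / QR rows of Table I
matmul_total = K * flops_matmul(N_i, N_T, N_i)  # step 4 of Table III
gram_total = K * flops_gram(N_i, N_T)           # steps 3 and 4 of Table II
```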

For the conventional RBD method [28], the authors consider a linear MU-MIMO precoding scheme for DL MIMO systems. For the non-regenerative MIMO relay systems [29], the authors investigate a precoding design for a 3-node MIMO relay network. In [30], a relay-aided system based on a quasi-EVD channel is proposed. We compare the required number of FLOPs of our proposed method with those of conventional precoding algorithms, namely the conventional RBD, the non-regenerative MIMO relay system, and the CD-BD algorithm, as shown in Tables I, II, III, and IV, under the assumption that $N_T = N_R$ and $\bar{N}_i = N_T - N_i$.

Fig. 5. The complexity comparisons for required FLOPs versus the number of the users K.

Fig. 6. The complexity comparisons for required FLOPs versus the number of the receive antennas Ni for each user.

For instance, the $(2, 2, 2) \times 6$ case denotes a system with three users ($K = 3$), where each user is equipped with two antennas ($N_i = 2$) and the total number of transmit antennas is six ($N_T = 2 \times 3 = 6$). The required numbers of FLOPs of the QL-QR algorithm, the conventional RBD, the non-regenerative MIMO relay system, and the CD-BD algorithm are 33530, 40824, 45306, and 34638, respectively. From these results, we can see that the reduction in the number of FLOPs of our proposed precoding method is 17.87%, 25.99%, and 3.20% compared to the conventional RBD, the non-regenerative MIMO relay system, and the CD-BD algorithm, respectively. Thus, our proposed QL-QR algorithm exhibits lower complexity than the conventional algorithms. In addition, the complexity reduces as $N_i$ and $N_T$ increase with fixed $K$.
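The quoted percentages follow directly from the four FLOP totals. The short sketch below recomputes them; the dictionary keys and the function name are illustrative labels only.

```python
# FLOP totals for the (2, 2, 2) x 6 case, taken from Tables I to IV
totals = {
    "QL-QR": 33530,
    "RBD": 40824,
    "non-regenerative": 45306,
    "CD-BD": 34638,
}

def reduction_percent(baseline, proposed):
    """Relative saving of the proposed scheme with respect to a baseline, in percent."""
    return 100.0 * (baseline - proposed) / baseline

reductions = {
    name: round(reduction_percent(flops, totals["QL-QR"]), 2)
    for name, flops in totals.items()
    if name != "QL-QR"
}
```

Rounded to two decimals, the three savings are 17.87, 25.99, and 3.2 percent, in agreement with the figures quoted in the text.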

We summarize the required numbers of FLOPs of the alternative methods in Tables I, II, III, and IV and plot them in Figures 5 and 6. Figure 5 shows the computational complexity where $N_i = 2$ and $K$ varies, and Figure 6 shows the computational complexity where $K = 4$ and $N_i$ varies. For the conventional RBD method, the orthogonal complementary
vector Vfc 0 requires K times SVD operations. If only o 
is obtained, it is not computationally efficient. In step 5, the 
efficient channel Heff = HiP“ is decomposed by the SVD 
with a dimension Re// x Nt, where Re// is the rank of He//. 
In the nonregenerative MIMO relay method and the CD-BD 
algorithm, two SVD operations are performed for the channels 
from the source to relay and from relay to the destination, and 
then the efficient channel covariance matrix is measured. In the 
nonregenerative MIMO relay method, the authors compute A 
using the EVD an then they diagonalize G. In the CD-BD 
algorithm, the authors calculate V“ by the SVD of and 
then they structure Vj by using the Choleskey decomposition. 

In our proposed QL-QR algorithm, we take advantage of the QL and QR decompositions instead of the SVD operation, and then we compute the efficient channel and decompose the noise covariance matrix by the Cholesky decomposition. Finally, we calculate the determinant of $\mathbf{B}_i$ to solve the optimization problem. As the FLOP counts in Tables I to IV show, the proposed QL-QR algorithm requires fewer operations than the conventional algorithms.

V. Simulation Results 

In this section, we study the performance of the proposed QL-QR algorithm for two-way MIMO relay networks. All the simulations are performed under the assumption that all channels are Rayleigh fading channels whose entries are independently generated following $\mathcal{CN}(0,1)$. The noise variances $\sigma_i^2$ are set to be equal. The total relay power constraint can be written as

$$P_{R_1} + P_{R_2} = \alpha P_R + (1 - \alpha)P_R = P_R, \tag{61}$$

where $\alpha \in [0,1]$ is an auxiliary value as well as a power allocation coefficient between the two relay nodes. All the simulation results are averaged over 1000 channel trials.

In Fig. 7, we compare the sum mutual information (SMI) of various MU-MIMO schemes where full CSI is known at each node. We set $P_1 = P_2 = 10$ dB, $M = 1$, and an equal power budget for the two relays ($\alpha = 0.5$). The negative SMI is adopted in [16], which can be defined as

$$\mathrm{MI}_{\mathrm{sum}} = \log_2|\mathrm{MSE}_1| + \log_2|\mathrm{MSE}_2|. \tag{62}$$

In our proposed method, the SMI shown in the simulation 
results is calculated as —2 log 2 |B^| by using (|4^ . (l5^ . and 
(I 57 I 1 . It can be observed that the proposed QL-QR algorithm 
has the same SMI performance as an optimal solution in fTh). 

Figure 8 shows the SMI performance of our proposed method versus the number of relays $T$, which is even. We consider a practical scenario with different relay power constraints. We set $P_R = 30$ dB and $\alpha = 0.5$. It is clear that, for different values of $P_1$ and $P_2$, the solution of our proposed QL-QR algorithm performs better than the max-power solution.

Figure 9 exhibits the BER performance of the BD water filling, the RBD, the SVD-RBD, and our proposed QL-QR method, where QPSK modulation is used. As



Fig. 7. The achieved SMI for = 4, 2. 



Fig. 8. The SMI versus the number of the relays T. 


pointed out in [31], the BER performance of a MIMO precoding system is actually determined by the energy of the transmitted signal. To simplify our discussion, we assume
a = 0. In the RBD, det = lll^i where 

H e for M < N, is an equivalent channel matrix 

with its eigenvalues A^. In our proposed QL-QR method, for 
source node Si, we have det ‘^i- Under 

the stipulaton that detFu/ = 1, we are able to easily obtain 
Xi = Therefore, our proposed QL-QR method has the same 
BER performance as the SVD-RBD method. 

VI. Conclusions 

This paper studies the joint optimization problem for an AF-based MIMO TWRC, where two source nodes exchange
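TABLE I

Computational complexity of the proposed QL-QR Algorithm.

Step | Operations | FLOPs | Case: (2, 2, 2) x 6
1 | $\mathbf{V}_1, \mathbf{V}_2$ | 2 X K{mN'f - 2m'f + 177V,) | 1560
2 | $\mathbf{Q}_{L,1}\mathbf{L}_1, \mathbf{Q}_{L,2}\mathbf{L}_2$ | $2 \times 16K(N_T^2N_i - N_TN_i^2 + \frac{1}{3}N_i^3)$ | 4864
3 | $\mathbf{Q}_{R,1}\mathbf{R}_1, \mathbf{Q}_{R,2}\mathbf{R}_2$ | $2 \times 16K(N_T^2N_i - N_TN_i^2 + \frac{1}{3}N_i^3)$ | 4864
4 | $\mathbf{H}_{1,1}^T\mathbf{F}_1\mathbf{H}_{1,2}$ | $8N_T^2N_i + 4N_TN_i^2 + 2N_TN_i$ | 696
5 | $\mathbf{H}_{2,1}^T\mathbf{F}_2\mathbf{H}_{2,2}$ | $8N_T^2N_i + 4N_TN_i^2 + 2N_TN_i$ | 696
6 | $\mathbf{C}_1$ | 2K { 32N^Ni + 8NtN^ + 2N ^ - iNi + ^Nt) | 14856
7 | | $K(\frac{14}{3}N_T^3 - 2N_T^2 + N_T)$ | 2826
8 | $\det\mathbf{B}_1$ | $4K(N_T^3 + N_T^2 + 2N_T)$ | 3168
Total | | | 33530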


TABLE II

Computational complexity of the nonregenerative MIMO relay system [29].

Step | Operations | FLOPs | Case: (2, 2, 2) x 6
1 | | $8K(4N_T^2N_i + 8N_TN_i^2 + 9N_i^3)$ | 13248
2 | UfSfAp | $8K(4N_T^2N_i + 8N_TN_i^2 + 9N_i^3)$ | 13248
3 | | $4KN_iN_T(N_i + 1)$ | 432
4 | H-H, | $4KN_iN_T(N_i + 1)$ | 432
5 | Hf[a^a2^(H,F)"H,F + I]-^H, | $2K(N_i^3 + 8N_iN_T^2 + 4N_i^2N_T + 2N_iN_T - N_i^2 + N_i)$ | 4212
6 | VaAaV^ | $8K(4N_T^2N_i + 8N_TN_i^2 + 9N_i^3 + \frac{1}{2}N_i)$ | 13272
7 | diag(G) | $K[4N_iN_T(N_i + 1) + 2N_i^3 - 2N_i^2 + N_i]$ | 462
Total | | | 45306


TABLE III

Computational complexity of the conventional RBD [28].

Step | Operations | FLOPs | Case: (2, 2, 2) x 6
1 | | $32K(N_T\bar{N}_i^2 + 2\bar{N}_i^3)$ | 21504
2 | ((S“fS“ + p2i)-i/2 | K{l8NTNf - 2Nf) | 336
3 | | $8KN_T^3$ | 5184
4 | H,P“ | $K(8N_TN_i^2 - 2N_i^2)$ | 552
5 | ufsfv™ | $64K(\frac{9}{8}N_i^3 + N_TN_i^2 + \frac{1}{2}N_T^2N_i)$ | 13248
Total | | | 40824


TABLE IV

Computational complexity of the CD-BD algorithm (2).

Step | Operations | FLOPs | Case: (2, 2, 2) x 6
1 | | $8K(4N_T^2N_i + 8N_TN_i^2 + 9N_i^3)$ | 13248
2 | Af2Si,2U,,2 | $8K(4N_T^2N_i + 8N_TN_i^2 + 9N_i^3)$ | 13248
3 | H,,2WH,i | $K[8N_iN_T^2 - 2N_iN_T + 4N_iN_T(N_i + 1)]$ | 2088
4 | | $2K(N_i + 2N_TN_i(N_i + 1) + \frac{4}{3}N_i^3)$ | 508
5 | Kf mse | $\frac{4}{3}N_R^3 + 12N_R^2N_T - 2N_R^2 - 2N_TN_R$ | 2736
6 | | 8A[4AtA^ - 4A^/3 + N't{N, + 1)] | 2336
7 | (Q.Qf | $K[4N_TN_i(N_i + 1) + 3N_i + 2N_i^3 - 2N_i^2]$ | 474
Total | | | 34638
Fig. 9. BER performance on the Rayleigh fading channel.


their messages with the help of two relay nodes. A relay filter has been designed that efficiently combines the source and relay nodes. Our main contribution is that the optimal beamforming matrices can be computed efficiently using determinant maximization techniques through an iterative QL-QR algorithm based on an MSE balancing method. The proposed QL-QR algorithm significantly reduces the computational complexity and achieves BER performance equivalent to that of the SVD-RBD algorithm.